Memory caching

ABSTRACT

There are disclosed apparatus and methods for achieving maximum data transfer. Memories and interfaces between the memories are provided. An actively determined number of data units having an actively determined unit size are transferred between the memories to provide the maximum data transfer.

RELATED APPLICATION INFORMATION

This patent is a continuation of U.S. application Ser. No. 10/039,953 filed Dec. 31, 2001, which claims priority from U.S. application Ser. No. 09/930,804 filed Aug. 15, 2001, both of which are incorporated by reference.

NOTICE OF COPYRIGHTS AND TRADE DRESS

A portion of the disclosure of this patent document contains material which is subject to copyright protection. This patent document may show and/or describe matter which is or may become trade dress of the owner. The copyright and trade dress owner has no objection to the facsimile reproduction by any one of the patent disclosure as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright and trade dress rights whatsoever.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention is related to First In First Out (FIFO) memories with caching.

2. Description of the Related Art

Communications networks now require handling of data at very high serial data rates. For example, 10 gigabits per second (Gbps) is common. When it is required to process at these speeds, high-speed data parallel connections are used to increase the effective bandwidth. This may be unsatisfactory because of the resultant decrease in bandwidth due to increased overhead requirements. There is a need for effective high speed switching apparatus and the associated hardware to support such an apparatus.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a memory provided in one embodiment of the present invention;

FIG. 2 is a diagrammatic portion of FIG. 1 illustrating its operation;

FIG. 3 is a flow chart of the operation of FIG. 1;

FIG. 4 shows one embodiment of a head and tail caching system constructed in accordance with the present invention;

FIG. 5 shows one embodiment of a controller for use with the head and tail caching system of FIG. 4;

FIG. 6 shows a diagram of transfer efficiency versus data block size;

FIG. 7 shows a diagram of how input data is grouped into blocks in accordance with the present invention;

FIG. 8 shows one embodiment of a flow diagram for operating a head and tail caching system in accordance with the present invention; and

FIG. 9 shows one embodiment of a head and tail caching system for processing multiple data streams in accordance with the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Throughout this description, the embodiments and examples shown should be considered as exemplars, rather than limitations on the apparatus and methods of the present invention.

As disclosed in a co-pending application entitled High Speed Channels Using Multiple Parallel Lower Speed Channels, attorney docket 0679/13, switching of input data arriving at a relatively high data rate of, for example, 10 Gbps may be accomplished. As illustrated in FIG. 1, this is done with a plurality of switching elements SE0-SE7 which operate at a much lower data rate, for example 2.5 Gbps. By the use of a sequential or successive sprinkling technique for complete data packets, a high data rate may be maintained, for example, by providing for load balancing. Data packets arrive on line 12 at 10 Gbps from a receiver 11, which would have a communications processor coupled to it, and pass via the variable FIFO memory illustrated at 13, FIFO being First In First Out memory. Data packets are routed to a sequential sprinkler engine 14 and then distributed at the lower data rate to various switching elements. In general, a variable FIFO memory is required where a sudden burst of input data may occur which would temporarily overwhelm an individual FIFO memory without a large scale buffer memory (which it can be assumed has almost unlimited memory capacity since it is remote or off the same semiconductor chip as the high speed memory).

FIG. 2 illustrates where some latency may occur; in other words, there would not be a continuous serial transmission of the high-speed data packets through to the switch elements. Thus, the data packets 1, 2, 3 are indicated in a line of data being received. The first data packet is routed to the switching element 7. After this operation is started, a short time later as indicated by the time lapse t1, data packet two is distributed by the sprinkler engine; and then data packet three at a later time t2. Some latency occurs which must be compensated for by some type of buffer apparatus.

This is provided by the overall variable FIFO memory which is a combination of a tail FIFO memory 16, a head FIFO memory 17 and the large scale off chip buffer memory 18. Variable blocks of data are formed by a receiver 11 and transferred through the tail FIFO memory to the head FIFO memory 17 until it is filled. Thus, the tail FIFO 16 routes data to the head FIFO memory 17 which then distributes data packets to the various switching elements. If the head FIFO memory becomes full, the tail FIFO memory will start filling. The tail FIFO will buffer enough data to keep the head FIFO filled. If the tail FIFO fills due to a sudden burst, data is then written on the line 21 to the large scale off chip memory 18. This data will be read from the large scale memory into the head FIFO when the head FIFO starts to empty.

From a practical standpoint, to operate at the data rate of 10 Gbps, tail FIFO 16 and head FIFO 17 are located on a common semiconductor substrate or chip with the large scale buffer memory 18 being remotely located off chip. This is indicated by the dashed line 22. When the tail FIFO memory becomes full then the large scale off chip buffer memory 18 is utilized. Uniform blocks of data are stored, as indicated by the dashed line 23. For example, 128 bytes are transferred on the line 21 into the memory 18. This memory also includes a similar block size of 128 bytes. For example, line 21 may have a 64 bit width (meaning eight bytes) and thus, the data block of 128 bytes is transferred in 16 clock cycles (16 × 8 bytes = 128 bytes). Optimization of the bus width in all of the FIFO and buffer memories provides, in effect, a 100 percent efficient transfer technique since for every clock cycle a maximum number of bits is transferred. However, buffer memory 18 has a lower clock rate and therefore a wider bus. In the present application this could be two read and two write cycles. The various write pointers and read pointers (WP and RP) are so indicated on the various memories and the overall control is accomplished by the memory controller 26. A multiplexer 27 connected to memory controller 26 provides for control of the various data routings. When a sudden burst of data packets ceases, the FIFO memory can then return to its ordinary mode of operation, wherein the head FIFO memory 17 contains all of the inputted data packets as delivered by the tail FIFO memory. Of course, this does not occur until the large scale off chip buffer memory 18 is unloaded.
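The cycle count in the 128-byte example follows directly from the block size and the bus width. The following is a minimal sketch of that arithmetic; the function name cycles_per_block is chosen here for illustration and does not appear in the disclosure.

```python
def cycles_per_block(block_bytes: int, bus_width_bits: int) -> int:
    """Clock cycles needed to move one block over a bus that carries
    bus_width_bits per cycle (rounded up if the block is not a multiple)."""
    bus_bytes = bus_width_bits // 8          # a 64-bit line carries 8 bytes per cycle
    return -(-block_bytes // bus_bytes)      # ceiling division


# The example from the text: a 128-byte block on a 64-bit line
print(cycles_per_block(128, 64))
```

Because every cycle carries a full bus width of data, no cycle is wasted as long as the block size is a whole multiple of the bus width.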

The foregoing operation is shown in a flow chart of FIG. 3. In step 41 the head FIFO memory is filled, and in step 42, if the head FIFO overflows, the tail FIFO memory is filled. Then in step 43, again when the tail FIFO is filled, data is stored in the buffer memory until the head FIFO begins to empty. In general, memory controller 26 monitors the FIFO depth and determines if a block of data needs to be stored to off chip memory. It also keeps track of how many blocks are written. As the FIFO memories empty, the memory controller is responsible for arbitrating and retrieving any stored blocks of data. The larger external buffer memory 18 can be provisioned, using one of many allocation schemes, to support multiple head and tail FIFOs in the same manner as described. Thus, multiple variable FIFO memories with head and tail caching are provided.
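The three-tier behavior of steps 41 to 43 can be modeled in software. The sketch below is only an illustration under stated assumptions: it moves single items rather than uniform 128-byte blocks, treats the off chip buffer as unbounded, and the class name VariableFifo and its methods are hypothetical. It also assumes that the tail passes data directly to the head only while the buffer memory is empty, which keeps the first in first out order.

```python
from collections import deque


class VariableFifo:
    """Head FIFO and tail FIFO (on chip) backed by a large off chip buffer."""

    def __init__(self, head_capacity: int, tail_capacity: int):
        self.head = deque()
        self.tail = deque()
        self.buffer = deque()   # off chip memory, assumed unbounded here
        self.head_capacity = head_capacity
        self.tail_capacity = tail_capacity

    def push(self, item) -> None:
        # Step 43: the tail is full, so its oldest entry spills to the buffer
        if len(self.tail) >= self.tail_capacity:
            self.buffer.append(self.tail.popleft())
        self.tail.append(item)  # new data always enters at the tail
        self._service()

    def pop(self):
        item = self.head.popleft()  # output is always taken from the head
        self._service()
        return item

    def _service(self) -> None:
        # Buffered data is older than tail data, so it refills the head first
        while self.buffer and len(self.head) < self.head_capacity:
            self.head.append(self.buffer.popleft())
        # Steps 41 and 42: the tail feeds the head directly while the buffer is empty
        while self.tail and not self.buffer and len(self.head) < self.head_capacity:
            self.head.append(self.tail.popleft())


fifo = VariableFifo(head_capacity=2, tail_capacity=2)
for packet in range(10):        # a burst larger than both FIFOs combined
    fifo.push(packet)
drained = [fifo.pop() for _ in range(10)]
```

After the burst, the buffer holds the overflow; once it is unloaded, the structure returns to the ordinary mode in which the tail feeds the head directly.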

FIG. 4 shows one embodiment of a head and tail caching system 400 constructed in accordance with the present invention. The system 400 includes a FIFO circuit 402, a controller 404, and a memory 406. The FIFO circuit 402 includes a tail FIFO memory 410, a head FIFO memory 408 and a multiplexer (mux) 412. In one embodiment, the tail and head FIFOs have 256 bytes of memory for data storage. However, the FIFOs may be of any size depending on the caching application. The mux 412 has two inputs that can each be selectively coupled to a mux output.

During operation of the system 400, a high-speed data stream is received at an input 424 to the tail FIFO. For example, the data stream may have a data rate of 10 Gbps or higher, and may include data frames with varying data lengths, for example, from a few bytes to thousands of bytes per frame. The received data is temporarily stored at the tail FIFO until it is transferred from an output 426 of the tail FIFO to a first input of the mux 412. The mux 412 includes a mux control input 414 that can be used to control the mux to couple the data received from the tail FIFO at the first mux input to a mux output 416 that is coupled to the head FIFO 408. The data is temporarily stored at the head FIFO until it is transferred from an output 418 of the head FIFO on a high-speed transmission path to another data receiving entity. For example, the caching system 400 may transmit data at the same rate the data is received. Thus, in one mode of operation, data received at the tail FIFO flows directly through the mux 412 to the head FIFO where it is output to other entities.

The controller 404 is coupled to a fill level indicator 420 of the tail FIFO and a fill level indicator 422 of the head FIFO. The fill level indicators allow the controller 404 to determine how much memory space is being used and how much memory space is available at the tail and head FIFOs. The controller 404 is also coupled to the tail FIFO output 426, so that the controller can receive data output from the tail FIFO 410.

The memory 406 is preferably a large buffer memory that provides more memory space than that provided by the tail and head FIFOs. However, the memory 406 may be of any desired size. The memory has a read/write interface 428 that is coupled to the controller 404. As a result, the controller is operable to receive data from the tail FIFO output 426 and write the data into the memory 406 via the interface 428. At some desirable time thereafter, the controller is able to read the data from the memory via the interface 428. In one embodiment, the interface comprises a 128-bit wide data path; however, the data path may be set to any desired width.

The controller 404 also includes an output 432 that is coupled to a second input of the mux 412 to allow the controller to output data to the mux. The controller also generates the mux control signal 414, so that the controller can control the operation of the mux to couple either of the mux inputs to the mux output 416. Thus, in one mode of operation, the controller receives data from the tail FIFO, stores that data into the memory 406, and at some time later, retrieves the data from the memory and outputs that data to the second input of the mux. Furthermore, the controller controls the mux operation, via the mux control 414, to couple the second mux input to the mux output, so that the data flows to the head FIFO, where it is ultimately output at output 418.
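The two data paths through the mux 412 can be expressed as a small selection function. This is a sketch only; the names MuxSelect, mux_412 and choose_select are hypothetical, and the rule in choose_select (cached data goes first because it is older than data still in the tail FIFO) is an assumption made here to preserve ordering, not a rule stated in the description.

```python
from enum import Enum


class MuxSelect(Enum):
    TAIL = 0         # first mux input: data straight from tail FIFO output 426
    CONTROLLER = 1   # second mux input: data read back from memory via output 432


def mux_412(select: MuxSelect, tail_data: bytes, controller_data: bytes) -> bytes:
    """Return whichever input the mux control 414 selects (mux output 416)."""
    return tail_data if select is MuxSelect.TAIL else controller_data


def choose_select(blocks_in_memory: int) -> MuxSelect:
    # Assumption: anything already cached in the memory is older than the
    # tail data, so the controller path is selected until the memory is empty.
    return MuxSelect.CONTROLLER if blocks_in_memory > 0 else MuxSelect.TAIL
```

With an empty memory the mux passes tail data straight through, which corresponds to the direct mode of operation; otherwise it forwards what the controller has read back from the memory.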

Based on the specific application, the system 400 can be configured to include various data path sizes to transfer data from the input to the output to facilitate the caching function. For example, the data input 424 may be a serial or parallel bit stream at a very high data rate (e.g., 10 Gbps). The tail FIFO may operate on the data in the same format as received or may convert the data into a parallel format (e.g., 8-bit byte format) having a byte rate that is less than the input serial data rate. The tail FIFO may output the wider but reduced rate data to the controller, which in turn, may further format the byte data into words having a lower word rate for storage in the memory 406. For example, the write and read data paths to the memory may be 64-bit wide paths, so that the memory 406 may operate at a much slower speed than the FIFOs (410, 408). Thus, it is possible to configure the data paths and the operation of various components to adjust to the transmission rate of the data so that faster or slower components may be utilized.
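The trade between path width and operating rate can be checked with a short calculation. The sketch below assumes the 10 Gbps serial input and the 8-bit and 64-bit path widths given as examples in the text; the function name unit_rate is hypothetical. It covers one direction only, so a memory that must both store and later return the same data would need to sustain the write and read traffic together.

```python
def unit_rate(serial_bits_per_second: float, width_bits: int) -> float:
    """Units per second needed to carry a serial stream on a width_bits-wide path."""
    return serial_bits_per_second / width_bits


serial_rate = 10e9                        # 10 Gbps input, as in the example
byte_rate = unit_rate(serial_rate, 8)     # rate of the 8-bit byte format
word_rate = unit_rate(serial_rate, 64)    # rate of 64-bit words at the memory

print(f"{byte_rate:.4g} bytes/s, {word_rate:.4g} words/s")
```

Widening the path by a factor of eight lowers the required transfer rate by the same factor, which is what allows slower components at the memory.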

FIG. 5 shows one embodiment of the controller 404 for use with the head and tail caching system 400. The controller 404 includes a processor 502, a memory interface 504, a tail FIFO fill detector 506, a head FIFO fill detector 508, a tail FIFO data interface 510 and a mux interface 512.

The processor 502 may comprise a central processing unit (CPU) executing program instructions, or may comprise a gate array or stand-alone hardware logic, or any combination of software and/or hardware. The processor is coupled to the other components within the controller 404 via a bus 514.

The tail FIFO fill detector 506 couples to the fill level indicator 420 and operates to determine tail FIFO fill information, and to transfer this information to the processor 502 via the bus 514. The head FIFO fill detector 508 couples to the fill level indicator 422 and operates to determine head FIFO fill information, and to transfer this information to the processor 502 via the bus 514. The tail FIFO data interface 510 operates to receive data output from the tail FIFO and to output this data on the bus 514 for processing by the processor 502 or for storage in the memory 406 via the memory interface 504.

The memory interface 504 operates to read and write data to the memory 406. During write operations, the data is received at the memory interface 504 via the bus 514. During read operations, the data is read from the memory and placed on the bus 514. The processor 502 operates to control the flow of data to and from the memory interface by providing control instructions via the bus 514.

The mux interface 512 operates to receive mux control instructions from the processor 502 via the bus 514 and transfer these instructions to the mux 412 via the mux control line 414. The mux interface 512 also operates to receive data from the bus 514 and output this data, via output 432, to the second input of the mux 412. Thus, the controller 404 includes a processor 502 and various interface components to control the flow of data from the tail FIFO to the memory, and from the memory to the head FIFO.

In one embodiment included in the present invention, a system is provided for efficient memory utilization. For example, the system provides efficient memory utilization by providing the most efficient utilization of the communication bandwidth to and from the memory. Thus, it is possible for the caching system to receive and transmit data at high data rates, while using slow speed components to perform memory operations during caching.

FIG. 6 shows a diagram 600 illustrating transfer efficiency versus data block size when transferring data to and from a memory, such as the memory 406. For example, the efficiency can be measured across the memory interface 428, as indicated in FIG. 4 at 434.

The diagram 600 shows a transfer efficiency indicator on the vertical axis 602, and the number of blocks transferred on the horizontal axis 604. The block size describes an amount of data transferred in a memory access. For example, a single memory access may transfer 4, 8, or 16 bytes of data, or in some cases even more. Furthermore, there may be some overhead associated with each block of data transferred. Thus, the diagram 600 demonstrates that for transfers involving a small number of blocks, the block overhead decreases efficiency. The diagram 600 also shows that efficiency decreases when less than full blocks of data are transferred. Additionally, the diagram 600 shows that as the number of blocks transferred increases, the transfer efficiency increases and the effect on efficiency of block overhead decreases. The variation in the efficiency shown in the diagram 600 is referred to as “sawtooth” behavior. The sawtooth behavior results from transferring less than full blocks of data.
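The sawtooth shape can be reproduced with a simple model. The sketch below is an illustration, not the curve of FIG. 6: it assumes that a partially filled block still occupies a full block on the interface, and it models the overhead as a fixed cost per transfer, which is one possible reading of the overhead mentioned above. The 64-byte block, the 16-byte overhead and the function name transfer_efficiency are assumptions chosen for the example.

```python
def transfer_efficiency(payload_bytes: int,
                        block_bytes: int = 64,
                        overhead_bytes: int = 16) -> float:
    """Useful bytes divided by bytes actually occupied on the memory interface."""
    blocks = -(-payload_bytes // block_bytes)            # partial block still costs a full one
    bytes_on_bus = blocks * block_bytes + overhead_bytes  # assumed fixed cost per transfer
    return payload_bytes / bytes_on_bus


# Efficiency peaks at whole multiples of the block size and drops just past them
for payload in (32, 64, 65, 128, 129, 256):
    print(payload, round(transfer_efficiency(payload), 3))
```

In this model the efficiency falls each time the payload spills one byte into a new block, and the peaks rise as more full blocks share the same overhead, which matches the qualitative behavior described for the diagram 600.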

FIG. 7 shows a portion of the tail FIFO 410 illustrating how received data is grouped into blocks in accordance with the present invention. Assume the data shown in FIG. 7 represents data frames received and stored in the tail FIFO. The data frames A, B, C, and D contain varying amounts of data and may include associated data header information. A tail FIFO processor 702 controls the flow of data into and out of the tail FIFO. Also shown is the tail FIFO fill level indicator 420.

In one embodiment, complete data frames are transferred from the tail FIFO to the memory as necessary to perform caching in accordance with the present invention. However, transferring entire frames may result in memory transfer blocks being only partially filled, which decreases transfer efficiency as described above with reference to FIG. 6. In another embodiment, therefore, the data frames are grouped together to form completely filled memory transfer blocks. For example, a memory transfer block may contain data from one, two or more data frames. As a result, the memory transfer blocks may contain one or more frame boundaries and complete and/or partial frames.

In another embodiment included in the present invention, the received data is grouped into blocks, as shown by block indicators B1-B4. The size of blocks B1-B4 is determined to provide a selected memory efficiency. Thus, when data is removed from the tail FIFO for transfer to the memory 406, entire blocks are transferred so that the selected transfer efficiency is achieved. As shown in FIG. 7, the block indicated by B2 includes a frame boundary so that this block contains data from both Frame A and Frame B. By packing the data frames into completely filled blocks, and transferring those complete blocks to and from the memory 406, high memory efficiency is achieved.

However, filling each block may result in data from one frame being contained in more than one block. For example, block B2 in FIG. 7 includes data from Frame A, Frame B and the Frame B header information. When blocks are defined to comprise only a portion of a frame's data, then in one embodiment, the system inserts header information at the block boundary so that the frames may be correctly reassembled in the head FIFO before transmission.
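Packing variable-length frames into completely filled blocks, and recovering the frames afterwards, can be sketched as follows. This is an illustration under stated assumptions: instead of inserting header information at block boundaries as described above, it uses a simpler substitute in which every frame carries a 2-byte length prefix. That prefix, the 64-byte block size and the names pack_frames and unpack_frames are all hypothetical.

```python
from typing import List, Tuple

HEADER_BYTES = 2  # assumed: 2-byte big-endian length prefix in front of each frame


def pack_frames(frames: List[bytes], block_bytes: int = 64) -> Tuple[List[bytes], bytes]:
    """Concatenate length-prefixed frames and cut the result into full blocks.

    Returns the completely filled blocks plus the leftover bytes that stay
    behind until enough data arrives to fill another block.
    """
    stream = b"".join(len(f).to_bytes(HEADER_BYTES, "big") + f for f in frames)
    full = len(stream) - len(stream) % block_bytes
    blocks = [stream[i:i + block_bytes] for i in range(0, full, block_bytes)]
    return blocks, stream[full:]


def unpack_frames(stream: bytes) -> List[bytes]:
    """Reassemble frames from the byte stream using the length prefixes."""
    frames, pos = [], 0
    while pos + HEADER_BYTES <= len(stream):
        length = int.from_bytes(stream[pos:pos + HEADER_BYTES], "big")
        pos += HEADER_BYTES
        frames.append(stream[pos:pos + length])
        pos += length
    return frames


frames = [b"A" * 50, b"B" * 100, b"C" * 30, b"D" * 70]
blocks, leftover = pack_frames(frames)
restored = unpack_frames(b"".join(blocks) + leftover)
```

Every emitted block is full, blocks may straddle frame boundaries, and the frames are still recoverable once the blocks are read back in order.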

As the caching system operates, the blocks stored in the memory are eventually retrieved and transferred to the head FIFO. Again, the memory transfer blocks are completely full so that the selected efficiency is achieved when the blocks are transferred to the head FIFO.

In one embodiment, the memory interface 428 has a 128-bit wide data path. This data path width can transfer sixteen bytes of data to or from the memory. The sixteen data bytes define a data word. To achieve a selected efficiency, a block is determined to comprise four data words for a total of sixty-four data bytes. In one embodiment of the invention, the transfer efficiency can be selected by varying the number of blocks transferred at one time. For example, one level of efficiency can be achieved by transferring one block to the memory at a time. Another level of efficiency is achieved by transferring multiple blocks to the memory at one time.

FIG. 8 shows one embodiment of a flow diagram 800 for operating a head and tail caching system in accordance with the present invention. For the purposes of this description, it will be assumed that the caching system is incorporated into a network transmission path for caching data transmitted in the network.

At block 802, data is received at the tail FIFO for caching. At block 804, a memory transfer efficiency is selected and a corresponding block size is determined. For example, the selected efficiency level may result in a block size of four words, and where each memory access transfers two blocks.

At block 806, a determination is made whether the fill level of the head FIFO will allow additional data to be transferred from the tail FIFO to the head FIFO. If the head FIFO has space available, then the method proceeds to block 808. If the head FIFO does not have space available, then the method proceeds to block 810. For example, the controller makes the determination from the head FIFO fill indicator 422. In one embodiment, the head FIFO will receive a transfer from the tail FIFO when there is enough free space in the head FIFO to accommodate one or more blocks of data.

At block 808, data is transferred from the tail FIFO to the head FIFO. Once reaching the head FIFO, the data will ultimately be output on the output data transmission path. During this time, input data continues to be received by the tail FIFO, and so the method proceeds to block 806.

At block 810, data is accumulated in the tail FIFO to form one or more blocks. For example, as shown in FIG. 7, data is accumulated to form blocks B1-B4. Furthermore, the block definitions may cross over data frame boundaries as necessary. For example, the data is packed into the blocks so that a block may contain data from more than one data frame. The controller 404 determines how many blocks of data are currently in the tail FIFO from the tail FIFO fill indicator 420.

At block 812, the number of data blocks determined to achieve the selected efficiency level are transferred from the tail FIFO to the memory. For example, the controller 404 removes blocks of data from the tail FIFO and transfers the blocks of data into the memory 406 via the memory interface 428.

At block 814, a determination is made whether the fill level of the head FIFO will allow data blocks to be transferred from the memory to the head FIFO. If there is not enough space available in the head FIFO, the method proceeds to block 810, where blocks of data continue to form in the tail FIFO. If there is enough space in the head FIFO, the method proceeds to block 816. For example, the controller 404 makes this determination from the head FIFO fill indicator 422.

At block 816, blocks of data are transferred from the memory to the head FIFO for output on the output transmission path. The same number of blocks is transferred from the memory to the head FIFO as were transferred from the tail FIFO to the memory. This results in the selected memory efficiency being achieved.

Although described in sequential fashion, the method operates in a parallel fashion so that while data is continually received at the tail FIFO, other data stored in the memory is transferred to the head FIFO. Thus, the present invention is not limited to the method steps and sequence described with reference to FIG. 8.
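The decisions of FIG. 8 can be condensed into a single function that the controller might evaluate repeatedly. This is a simplified reading, not the claimed method: the function name next_action, its parameters and the returned strings are hypothetical, and the rule that the tail feeds the head directly only while the memory is empty is an assumption made here to keep the data in order.

```python
def next_action(head_free_blocks: int,
                tail_blocks: int,
                memory_blocks: int,
                blocks_per_transfer: int = 2) -> str:
    """Choose the next data movement from the current fill levels."""
    n = blocks_per_transfer
    # Blocks 814 and 816: cached blocks return to the head once it has room
    if memory_blocks >= n and head_free_blocks >= n:
        return "memory_to_head"
    # Blocks 806 and 808: direct path, assumed usable only when nothing is cached
    if memory_blocks == 0 and head_free_blocks >= 1 and tail_blocks >= 1:
        return "tail_to_head"
    # Blocks 810 and 812: enough full blocks have formed to move to the memory
    if tail_blocks >= n:
        return "tail_to_memory"
    # Block 810: keep accumulating data in the tail FIFO
    return "accumulate"


print(next_action(head_free_blocks=0, tail_blocks=2, memory_blocks=0))
```

Moving the same number of blocks in each direction keeps every memory access at the selected efficiency level; in hardware these checks would run concurrently rather than one at a time.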

FIG. 9 shows one embodiment of a head and tail caching system 900 for processing multiple data streams in accordance with the present invention. In the system 900, multiple caching circuits 902(1,2,3) are used to receive multiple input data streams, shown as Data In (1,2,3). The caching circuits are coupled to a controller 904 that is further coupled to a memory 906. The memory is divided into memory regions to be used for each cache. The controller operates to transfer blocks of data from the tail FIFOs associated with the caching circuits to an associated region in the memory 906. The transfers are done so that selected memory efficiency is achieved. The tail FIFO data is blocked into completely full blocks to achieve the selected efficiency. Thus, in accordance with the present invention, a caching system for caching multiple data streams is provided.
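One way to divide the memory 906 into regions is an equal, block-aligned split. The description leaves the allocation scheme open, so the sketch below is only one possibility; the function name divide_memory, the equal split and the 64-byte block size are assumptions.

```python
from typing import List, Tuple


def divide_memory(memory_bytes: int,
                  stream_count: int,
                  block_bytes: int = 64) -> List[Tuple[int, int]]:
    """Split a memory into equal regions, each a whole number of blocks.

    Returns one (base_address, size_in_bytes) pair per data stream.
    """
    blocks_per_region = (memory_bytes // block_bytes) // stream_count
    region_bytes = blocks_per_region * block_bytes
    return [(i * region_bytes, region_bytes) for i in range(stream_count)]


# Three caching circuits sharing a small 1024-byte memory
regions = divide_memory(1024, 3)
```

Keeping each region a whole number of blocks means that full-block transfers for one stream never cross into the region of another stream.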

The present invention includes a head and tail caching system for reduced sawtooth behavior. The embodiments described above are illustrative of the present invention and are not intended to limit the scope of the invention to the particular embodiments described. Accordingly, while several embodiments of the invention have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit or essential characteristics thereof. Accordingly, the disclosures and descriptions herein are intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.

1. An apparatus comprising: a memory interface operable to transfer an actively-determined number of data units having an actively-determined unit size; a first memory having a first input for receiving incoming data, a first output for transferring data units, a second output for transferring data units to the memory interface; a second memory having a second input for receiving data units from the memory interface, a third output for transferring data units; a third memory having a third input for receiving data units from the first output and the third output, a fourth output for outputting data; wherein the actively determined number of data units having the actively determined unit size together provide a maximum memory transfer efficiency level for the memory interface.
 2. The apparatus of claim 1 wherein the data units comprise frames of variable length; the first memory stores data in at least one fixed size.
 3. The apparatus of claim 1 wherein the maximum memory transfer efficiency level is achieved at least in part by maximizing an amount of data taken from the first memory and put in the data units.
 4. The apparatus of claim 1 wherein the maximum memory transfer efficiency level is achieved at least in part by maximizing an amount of data destined for the third memory in the data units.
 5. The apparatus of claim 1, wherein the third memory further comprises a third memory fill indicator to indicate a fill characteristic of the third memory.
 6. The apparatus of claim 5 wherein the third fill indicator further comprises a first fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the third memory; a second fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the second memory via the memory interface; a third fill level wherein the determined number of data units having the determined unit size are transferred from the second memory to the third memory.
 7. The apparatus of claim 1, wherein the first memory further comprises a first memory fill indicator to indicate a fill characteristic of the first memory.
 8. The apparatus of claim 7 wherein the first fill indicator further comprises a first fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the third memory; a second fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the second memory via the memory interface.
 9. The apparatus of claim 1, wherein the incoming frames are of varying length and where the determined number of units are defined to include data from one or more of the frames, and wherein a determined unit may contain data from two or more frames.
 10. The apparatus of claim 1, wherein a data path to the second memory is wider than a width characteristic of the third memory.
 11. The apparatus of claim 1, wherein the first memory and third memory reside on a common semiconductor substrate, and wherein the second memory is remote to the semiconductor substrate.
 12. The apparatus of claim 1 wherein the first memory and the third memory are selected from the group comprising first-in-first-out memories and data buffer memories.
 13. The apparatus of claim 1 wherein the second memory is selected from the group comprising on-chip Dynamic Random Access Memories, off-chip Content-Addressable Memories and off-chip Static Random Access Memories.
 14. A method comprising: actively determining a number of data units and a unit size to support a maximum efficiency level; transferring the determined number of data units having the determined unit size from the first memory to a second memory when the second memory is within a first fill level; transferring the determined number of data units having the determined unit size from the first memory to a third memory when the second memory is within a second fill level; transferring the determined number of data units having the determined unit size from the second memory to the third memory when the second memory is within a third fill level.
 15. The method of claim 14 wherein the data units include data from one or more frames of varying size; at least one data unit contains data from at least two frames.
 16. The method of claim 14 wherein transferring the data units is based on a second memory fill indicator associated with the second memory.
 17. The method of claim 14 wherein transferring the data units is based on a first memory fill indicator associated with the first memory.
 18. The method of claim 14 wherein transferring the data units to the second memory includes controlling whether data units from the first memory or the third memory are transferred to the second memory.
 19. An apparatus comprising: a first memory for receiving frames of varying amounts of incoming data; a second memory; a third memory coupled to the first memory and the second memory and operable to receive data either from the first memory or from the second memory; a controller coupled to the first memory, the second memory and the third memory and operable to control transfer of data; wherein an actively determined number of data units having an actively determined unit size is transferred between the first memory and the second memory and the third memory at a maximum memory transfer efficiency level.
 20. The apparatus of claim 19 wherein the data units comprise frames of variable length; the first memory stores data in at least one fixed size.
 21. The apparatus of claim 19 wherein the maximum memory transfer efficiency level is achieved at least in part by maximizing an amount of data taken from the first memory and put in the data units.
 22. The apparatus of claim 19 wherein the maximum memory transfer efficiency level is achieved at least in part by maximizing an amount of data destined for the third memory in the data units.
 23. The apparatus of claim 19 wherein the third memory further comprises a third memory fill indicator to indicate a fill characteristic of the third memory.
 24. The apparatus of claim 23 wherein the third fill indicator further comprises a first fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the third memory; a second fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the second memory via the memory interface; a third fill level wherein the determined number of data units having the determined unit size are transferred from the second memory to the third memory.
 25. The apparatus of claim 19, wherein the first memory further comprises a first memory fill indicator to indicate a fill characteristic of the first memory.
 26. The apparatus of claim 25 wherein the first fill indicator further comprises a first fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the third memory; a second fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the second memory via the memory interface.
 27. The apparatus of claim 19 wherein the incoming frames are of varying length and where the determined number of units are defined to include data from one or more of the frames, and wherein a determined unit may contain data from two or more frames.
 28. The apparatus of claim 19 wherein a data path to the second memory is wider than a width characteristic of the third memory.
 29. The apparatus of claim 19 wherein the first memory and third memory reside on a common semiconductor substrate, and wherein the second memory is remote to the semiconductor substrate.
 30. The apparatus of claim 19 wherein the first memory and the third memory are selected from the group comprising first-in-first-out memories and data buffer memories.
 31. The apparatus of claim 19 wherein the second memory is selected from the group comprising on-chip Dynamic Random Access Memories, off-chip Content-Addressable Memories and off-chip Static Random Access Memories.
 32. A method comprising: transferring an actively determined number of data units having an actively determined unit size from a first memory to a second memory, the first memory having an input to receive frames containing varying amounts of data and the second memory having an output to output data; transferring the actively determined number of data units having the actively determined unit size from the first memory to a third memory via a memory interface; and transferring the actively determined number of data units having the actively determined unit size from the third memory to the second memory, wherein the actively determined unit size and the actively determined number of data units together provide a maximum memory transfer efficiency level for the memory interface.
 33. The method of claim 32 wherein the data units include data from one or more frames of varying size and at least one data unit contains data from at least two frames.
 34. The method of claim 32 wherein transferring the data units is based on a second memory fill indicator associated with the second memory.
 35. The method of claim 34 wherein transferring the data units further comprises: transferring the data units from the first memory to the second memory at a first fill level; transferring the data units from the first memory to the third memory at a second fill level; and transferring the data units from the third memory to the second memory at a third fill level.
 36. The method of claim 32 wherein transferring the data units is based on a first memory fill indicator associated with the first memory.
 37. The method of claim 36 wherein transferring the data units further comprises: transferring the data units from the first memory to the second memory at a first fill level; and transferring the data units from the first memory to the third memory at a second fill level.
 38. The method of claim 32 wherein transferring the data units to the second memory includes controlling whether data units from the first memory or the third memory are transferred to the second memory.
 39. An apparatus comprising: a first memory; a second memory; and a memory interface coupled to the first memory and to the second memory, operable to copy an actively determined number of data units having an actively determined unit size from the first memory to the second memory, wherein the determined unit size and the determined number of data units together provide maximum memory transfer efficiency.
 40. The apparatus of claim 39 wherein the data units comprise frames of variable length and the first memory stores data in at least one fixed size.
 41. The apparatus of claim 39 wherein the maximum memory transfer efficiency level is achieved at least in part by maximizing an amount of data taken from the first memory and put in the data units.
 42. The apparatus of claim 39 wherein the maximum memory transfer efficiency level is achieved at least in part by maximizing an amount of data in the data units destined for output.
 43. The apparatus of claim 39 further comprising a third memory coupled to the first memory and the second memory.
 44. The apparatus of claim 43 further comprising a third memory fill indicator to indicate a fill characteristic.
 45. The apparatus of claim 44 wherein the third memory fill indicator further comprises: a first fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the third memory; a second fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the second memory via the memory interface; and a third fill level wherein the determined number of data units having the determined unit size are transferred from the second memory to the third memory.
 46. The apparatus of claim 39 wherein the first memory further comprises a first memory fill indicator to indicate a fill characteristic.
 47. The apparatus of claim 39 wherein the first memory fill indicator further comprises: a first fill level wherein the determined number of data units having the determined unit size are outputted from the first memory; and a second fill level wherein the determined number of data units having the determined unit size are transferred from the first memory to the second memory via the memory interface.
 48. The apparatus of claim 39, wherein the incoming frames are of varying length and where the determined number of units are defined to include data from one or more of the frames, and wherein a determined unit may contain data from two or more frames.
 49. The apparatus of claim 43, wherein a data path to the second memory is wider than a width characteristic of the third memory.
 50. The apparatus of claim 43, wherein the first memory and third memory reside on a common semiconductor substrate, and wherein the second memory is remote to the semiconductor substrate.
 51. The apparatus of claim 39 wherein the first memory is selected from the group comprising first-in-first-out memories and data buffer memories.
 52. The apparatus of claim 39 wherein the second memory is selected from the group comprising on-chip Dynamic Random Access Memories, off-chip Content-Addressable Memories and off-chip Static Random Access Memories.
 53. The apparatus of claim 43 wherein the third memory is selected from the group comprising first-in-first-out memories and data buffer memories. 
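
By way of illustration only, and not as a limitation of any claim, the packing recited in claims 27, 33 and 48 (a data unit that may contain data from two or more frames of varying length) might be sketched as follows. The function name pack_frames and its parameters are hypothetical and do not appear in the disclosure; the unit size is simply passed in, since how it is actively determined is not modeled here.

```python
from typing import Iterable, List


def pack_frames(frames: Iterable[bytes], unit_size: int) -> List[bytes]:
    """Pack variable-length frames back to back into data units.

    Hypothetical helper for illustration. Frame boundaries are ignored
    when cutting units, so a single unit may hold the tail of one frame
    and the head of the next. Only the final unit may be shorter than
    unit_size.
    """
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    # Concatenate all frame payloads into one contiguous byte stream.
    stream = b"".join(frames)
    # Cut the stream into consecutive units of unit_size bytes.
    return [stream[i:i + unit_size] for i in range(0, len(stream), unit_size)]


if __name__ == "__main__":
    # Three frames of 3, 5 and 2 bytes packed into 4-byte units.
    for unit in pack_frames([b"abc", b"defgh", b"ij"], 4):
        print(unit)
```

In this sketch the first unit carries data from the first and second frames, so no transfer slot is left partly empty at a frame boundary. That is one way to read "maximizing an amount of data taken from the first memory and put in the data units", under the assumption that every unit except the last is filled completely.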